Abstract

It is proposed that three definitions and a new theorem be added to the usual
probability theory. The resulting theory allows one to displace the central role
usually given to the notion of conditional probability. When a mapping $\varphi$
is defined between two measurable spaces, to each measure $\mu$ introduced on the
first space there corresponds an image $\varphi[\mu]$ on the second space, and,
reciprocally, to each measure $\nu$ defined on the second space there corresponds
a reciprocal image $\varphi^{-1}[\nu]$ on the first space. As the intersection
$\cap$ of two measures is easy to introduce, a relation like
$\varphi[\mu \cap \varphi^{-1}[\nu]] = \varphi[\mu] \cap \nu$ makes sense. It is,
indeed, a theorem of the theory. This theorem gives mathematical consistency to
inferences drawn from physical measurements.
1 Preliminary



Assume given a measurable space¹ $(\Omega, \mathcal{F})$, and a measure² $\mu$ on
$(\Omega, \mathcal{F})$ that is $\sigma$-finite³. Then $(\Omega, \mathcal{F}, \mu)$ is
called a $\sigma$-finite measure space. Let $\nu$ be a second measure on
$(\Omega, \mathcal{F})$. The following assertions (a symmetric version of the
Radon-Nikodym theorem) are equivalent (Schilling, 2006).

• The measure $\nu$ is absolutely continuous⁴ with respect to $\mu$.

• There is a $\mu$-almost everywhere unique function from $\Omega$ into $[0,\infty)$,
denoted $d\nu/d\mu$, such that

$$\nu[F] = \int_F \frac{d\nu}{d\mu}\, d\mu \quad \text{for every } F \in \mathcal{F} \tag{1}$$

The function $d\nu/d\mu$ is called the Radon-Nikodym density associated with $\nu$
by $\mu$, or the Radon-Nikodym derivative of $\nu$ with respect to $\mu$.

If a $\sigma$-finite measure $\mu$ is such that $\mu[\Omega] = 1$, one says that $\mu$ is a
probability measure, and the measure $\mu[F]$ of some set $F \in \mathcal{F}$ is then
called the probability of the set⁵ $F$.

1 As usual, here $\Omega$ denotes a set and $\mathcal{F}$ is a collection of subsets of
$\Omega$ that is a $\sigma$-field (i.e., $\mathcal{F}$ is nonempty and it is closed under
complementation and countable unions of its members).

2 A (positive) measure (measures are implicitly assumed to be positive) is a function
$\mu : \mathcal{F} \mapsto [0, \infty]$ satisfying two properties: (i) the measure of the
empty set is zero, and (ii) the measure of the union of a countable sequence of pairwise
disjoint sets in $\mathcal{F}$ equals the sum of the measures of each set.

3 A (positive) measure defined on a $\sigma$-algebra $\mathcal{F}$ of subsets of a set
$\Omega$ is called $\sigma$-finite if $\Omega$ is the countable union of measurable sets of
finite measure. A set in a measure space has $\sigma$-finite measure if it is a countable
union of sets with finite measure. A measure $\mu$ is called finite if $\mu[\Omega]$ is a
finite real number (rather than $\infty$). A $\sigma$-finite measure may not be finite
(the Lebesgue measure on the real line is $\sigma$-finite, but not finite). An example of
a measure on the real line that is not $\sigma$-finite is the counting measure (the
counting measure of a set of real numbers is the number of elements in the set): every
set with finite measure contains only finitely many real numbers, and it would take
uncountably many such sets to cover the entire real line. The Radon-Nikodym theorem does
not apply to the counting measure and no density can be associated to it. It is to avoid
such "pathological" measures that the $\sigma$-finite hypothesis is introduced.

4 The measure $\nu$ is said to be absolutely continuous with respect to the measure $\mu$
if $\mu[F] = 0 \Rightarrow \nu[F] = 0$. One writes $\nu \ll \mu$.

5 When dealing with probability measures only, sets are often called events.

2 Definitions and properties

2.1 Intersection of measures

Given a $\sigma$-finite measure space $(\Omega, \mathcal{F}, \mu)$, consider two
$\sigma$-finite measures $\nu_1$ and $\nu_2$, at least one of them (say $\nu_1$) being
absolutely continuous with respect to the base measure $\mu$.

Definition: Given some finite constant $n$, the intersection of the two
measures $\nu_1$ and $\nu_2$ is the measure denoted $\nu_1 \cap \nu_2$ and defined as

$$(\nu_1 \cap \nu_2)[F] = \frac{1}{n} \int_F \frac{d\nu_1}{d\mu}\, d\nu_2 \quad \text{for every } F \in \mathcal{F} \tag{2}$$

It is obvious that this defines a measure and that, by virtue of the Radon-Nikodym
theorem, it is absolutely continuous w.r.t. $\nu_2$. The operation $\cap$
depends on the base measure $\mu$, so, when necessary, a more explicit notation
that indicates $\mu$ can be used.

Remark: When dealing with arbitrary measures, one may well take $n = 1$,
while when dealing with probability measures, it may be more convenient to
take $n = \int_\Omega (d\nu_1/d\mu)\, d\nu_2$, as then $(\nu_1 \cap \nu_2)[\Omega] = 1$,
implying that the intersection of two probability measures is a probability measure
(but then the intersection is only defined if $0 < n < \infty$).

Should the measure $\nu_2$ also be absolutely continuous with respect to $\mu$,
equation (2) could be written
$(\nu_1 \cap \nu_2)[F] = \frac{1}{n} \int_F \frac{d\nu_1}{d\mu} \frac{d\nu_2}{d\mu}\, d\mu$;
the measure $\nu_1 \cap \nu_2$ would then also be absolutely continuous with respect
to $\mu$, and its Radon-Nikodym density would be

$$\frac{d(\nu_1 \cap \nu_2)}{d\mu} = \frac{1}{n}\, \frac{d\nu_1}{d\mu}\, \frac{d\nu_2}{d\mu} \tag{3}$$
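
As an illustration of equations (2) and (3), the following minimal sketch computes the intersection of two measures on a finite set. It assumes the counting measure as base measure, so that densities are simply point masses; the function name `intersect` and the dictionary representation are choices made for this example only.

```python
from fractions import Fraction

def intersect(d1, d2, normalize=False):
    """Density of nu1 ∩ nu2 w.r.t. the base measure, following equation (3).

    d1, d2: dicts mapping each point to the density of nu1 and nu2
    (counting measure assumed as base measure; an assumption of this sketch).
    normalize=False takes n = 1; normalize=True takes n = sum of d1 * d2,
    the choice suggested for probability measures.
    """
    product = {x: d1[x] * d2[x] for x in d1}
    n = sum(product.values()) if normalize else 1
    if n == 0:
        raise ValueError("intersection undefined: need 0 < n < infinity")
    return {x: p / n for x, p in product.items()}

nu1 = {"a": Fraction(1, 2), "b": Fraction(1, 2), "c": Fraction(0)}
nu2 = {"a": Fraction(1, 4), "b": Fraction(1, 4), "c": Fraction(1, 2)}

raw = intersect(nu1, nu2)                   # n = 1 (arbitrary measures)
prob = intersect(nu1, nu2, normalize=True)  # renormalized (probability measures)
print(raw, prob)
```

With `normalize=True` the result sums to one, which is the sense in which the intersection of two probability measures is again a probability measure.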

Comment: The term intersection is justified in section 3.2, where the
intersection of (measurable) sets is found as a special instance of the intersection
of measures. Another special instance of the intersection of (probability) measures
corresponds to the notion of conditional probability (see section 3.3).



2.2 Reciprocal image of a measure 

Let $(X, \mathcal{E})$ and $(Y, \mathcal{F})$ be two measurable spaces, and
$\varphi : X \mapsto Y$ a measurable⁶ mapping. Two measures $\mu$ and $\nu$ are
introduced (to be considered as base measures) such that $(X, \mathcal{E}, \mu)$ and
$(Y, \mathcal{F}, \nu)$ are $\sigma$-finite measure spaces.

Definition: Given some finite constant $n$, to every measure $\tau$ on
$(Y, \mathcal{F})$ that is absolutely continuous with respect to $\nu$ is associated a
measure on $(X, \mathcal{E})$, called the reciprocal image of $\tau$, denoted
$\varphi^{-1}[\tau]$, and defined via

$$\frac{d(\varphi^{-1}[\tau])}{d\mu} = \frac{1}{n} \left( \frac{d\tau}{d\nu} \circ \varphi \right) \tag{4}$$

Then, for every $E \in \varphi^{-1}[\mathcal{F}] \subseteq \mathcal{E}$, one has⁷
$(\varphi^{-1}[\tau])[E] = \frac{1}{n} \int_E \frac{d\tau}{d\nu}(\varphi(x))\, d\mu(x)$,
and the Radon-Nikodym theorem ensures that $\varphi^{-1}[\tau]$ is, indeed, a measure.
As this reciprocal image depends on the two base measures, the more
explicit notation $\varphi^{-1}[\tau; \mu, \nu]$ can be used.

6 The mapping $\varphi$ is measurable if the reciprocal image of every set in
$\mathcal{F}$ is in $\mathcal{E}$. Non-measurable mappings are generally considered
pathological.

7 As $\varphi$ is a measurable mapping, and the function $d\tau/d\nu$ is measurable (with
respect to $\mathcal{F}$), the function $(d\tau/d\nu) \circ \varphi$ is measurable (with
respect to $\varphi^{-1}[\mathcal{F}] \subseteq \mathcal{E}$). (See, e.g., Halmos, 1950.)
Remark: When dealing with arbitrary measures, one may well take $n = 1$,
while when dealing with probability measures, it may be more convenient to
take $n = \int_X \frac{d\tau}{d\nu} \circ \varphi \, d\mu$, as then
$(\varphi^{-1}[\tau])[X] = 1$, implying that the reciprocal image of a probability
measure is a probability measure (but then the reciprocal image is only defined if
$0 < n < \infty$).
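
Equation (4) can be sketched on finite sets as follows. The example assumes counting measures as base measures on both spaces; the names `reciprocal_image` and `parity` are hypothetical and only serve the illustration.

```python
from fractions import Fraction

def reciprocal_image(g, phi, X, normalize=False):
    """Density of phi^{-1}[tau] w.r.t. mu, following equation (4).

    g: dict giving the density d(tau)/d(nu) at each point of Y
    phi: the mapping from X into Y
    X: iterable of the points of X (counting base measures are assumed)
    """
    pulled = {x: g[phi(x)] for x in X}          # (d tau / d nu) o phi
    n = sum(pulled.values()) if normalize else 1
    if n == 0:
        raise ValueError("reciprocal image undefined: need 0 < n < infinity")
    return {x: v / n for x, v in pulled.items()}

def parity(x):
    return x % 2

g = {0: Fraction(1, 4), 1: Fraction(3, 4)}      # a probability on Y = {0, 1}
h = reciprocal_image(g, parity, [0, 1, 2, 3], normalize=True)
print(h)
```

Each point of $X$ simply inherits the density of its image; the normalization divides by the total so that the result is again a probability.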



2.3 Image of a measure 

Let $(X, \mathcal{E})$ and $(Y, \mathcal{F})$ be two measurable spaces, and
$\varphi : X \mapsto Y$ a measurable mapping.

Definition: To every measure $\pi$ on $(X, \mathcal{E})$ is associated a measure⁸ on
$(Y, \mathcal{F})$, denoted $\varphi[\pi]$, called the image measure:

$$\varphi[\pi] = \pi \circ \varphi^{-1} \tag{5}$$

i.e., explicitly, $(\varphi[\pi])[F] = \pi[\varphi^{-1}[F]]$ for every $F \in \mathcal{F}$.

The measure $\varphi[\pi]$ need not be⁹ absolutely continuous with respect to some
base measure, so $\varphi[\pi]$ may not be representable by a bona fide density¹⁰. This
does not cause any complication in the applications we have in mind.

We shall later need the following property (Halmos, 1950, page 163): for any
measurable function $K$ and any set $F \in \mathcal{F}$,

$$\int_F K \, d(\pi \circ \varphi^{-1}) = \int_{\varphi^{-1}[F]} K \circ \varphi \, d\pi \tag{6}$$

i.e., $\int_F K(y)\, d(\varphi[\pi])(y) = \int_{\varphi^{-1}[F]} K(\varphi(x))\, d\pi(x)$.

Comment: To have an intuitive idea of the notion "image of a measure",
consider a collection of elements $x_1, x_2, x_3, \dots$ of $X$ that are independent
sample elements of the measure $\pi$. Then, it is easy to see that the elements
$\varphi(x_1), \varphi(x_2), \varphi(x_3), \dots$ of $Y$ are independent sample elements
of the measure $\varphi[\pi]$. In fact, this property alone may suggest introducing the
notion of an image of a measure.



8 It is not difficult to verify that $\varphi[\pi]$ is, indeed, a measure. First, the
measure of the empty set is zero, because, by definition of the reciprocal mapping,
$\varphi^{-1}[\emptyset] = \emptyset$, so
$(\varphi[\pi])[\emptyset] = \pi[\varphi^{-1}[\emptyset]] = \pi[\emptyset] = 0$. Second,
we have to check that if $F_1, F_2, \dots$ is a countable sequence of pairwise disjoint
sets in $\mathcal{F}$, the measure of the union of all the $F_i$ is equal to the sum of
the measures of each $F_i$: $(\varphi[\pi])[\bigcup_i F_i] = \sum_i (\varphi[\pi])[F_i]$.
First, by definition of image of a measure, one has
$(\varphi[\pi])[\bigcup_i F_i] = \pi[\varphi^{-1}[\bigcup_i F_i]]$ and, as the reciprocal
image of a union is the union of the reciprocal images,
$(\varphi[\pi])[\bigcup_i F_i] = \pi[\bigcup_i \varphi^{-1}[F_i]]$. But $\pi$ is a
measure, and the reciprocal images of disjoint sets are disjoint, so
$(\varphi[\pi])[\bigcup_i F_i] = \sum_i \pi[\varphi^{-1}[F_i]]$. Finally, using again the
definition of image of a measure, this leads to the desired property. We have thus
checked that the image of a measure is a measure.

9 As an example, this happens when $X = \mathbb{R}^p$ and $Y = \mathbb{R}^q$ with
$p < q$, and a continuous mapping $\varphi$ (with the standard Lebesgue measures
assumed), because then $\varphi[X]$ is a $p$-dimensional submanifold of $\mathbb{R}^q$.

10 When $\varphi[\pi]$ is representable by a density, it is, in general, easy to find an
expression of it, but the (elementary) methods to be used are quite different in every
situation (see examples in sections 4.2 and 4.3), and a general expression for the
density is not available.
2.4 Compatibility property

Let $(X, \mathcal{E}, \mu)$ and $(Y, \mathcal{F}, \nu)$ be two $\sigma$-finite measure
spaces, and $\varphi : X \mapsto Y$ be a measurable mapping. Let $\pi$ be a measure over
$(X, \mathcal{E})$ that is $\sigma$-finite, and $\tau$ a measure over $(Y, \mathcal{F})$
that is absolutely continuous with respect to the base measure $\nu$.

Theorem: One always has

$$\varphi[\pi \cap \pi'] = \tau' \cap \tau \quad \text{where} \quad \pi' = \varphi^{-1}[\tau] , \quad \tau' = \varphi[\pi] \tag{7}$$

Note that while the measure $\tau$ is assumed to be absolutely continuous with
respect to the base measure $\nu$, the measures $\varphi[\pi]$ and
$\varphi[\pi \cap \pi']$ may be singular.

To demonstrate the identity in equation (7) means to verify that for any set
$F \in \mathcal{F}$, one has $(\varphi[\pi \cap \pi'])[F] = (\tau' \cap \tau)[F]$. This
is done by writing the following sequence of identities (that successively use
equations (5), (2), (4), (6), and (2) again):

$$\begin{aligned}
(\varphi[\pi \cap \varphi^{-1}[\tau; \mu, \nu]])[F]
&= (\pi \cap \varphi^{-1}[\tau; \mu, \nu])[\varphi^{-1}[F]] \\
&= \frac{1}{n'} \int_{\varphi^{-1}[F]} \frac{d(\varphi^{-1}[\tau; \mu, \nu])}{d\mu}(x)\, d\pi(x) \\
&= \frac{1}{n''} \int_{\varphi^{-1}[F]} \frac{d\tau}{d\nu}(\varphi(x))\, d\pi(x) \\
&= \frac{1}{n''} \int_F \frac{d\tau}{d\nu}(y)\, d(\varphi[\pi])(y) \\
&= (\varphi[\pi] \cap \tau)[F]
\end{aligned} \tag{8}$$

so the property holds¹¹.
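
The identity (7) can be checked exactly on a small discrete example. The sketch below assumes counting measures as base measures, takes $n = 1$ in equations (2) and (4), and uses hypothetical names (`image`, `parity`); it is a numerical sanity check, not a substitute for the proof above.

```python
from fractions import Fraction

def image(f, phi):
    """Point masses of phi[pi] (equation (5)) for a measure on a finite set."""
    g = {}
    for x, mass in f.items():
        g[phi(x)] = g.get(phi(x), Fraction(0)) + mass
    return g

def parity(x):
    return "even" if x % 2 == 0 else "odd"

pi = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}
# tau lives on Y, which contains one point that phi never reaches.
tau = {"even": Fraction(1, 2), "odd": Fraction(1, 4), "unreached": Fraction(1, 4)}

# Left-hand side: phi[pi ∩ phi^{-1}[tau]], with n = 1 in equations (2) and (4).
pi_post = {x: pi[x] * tau[parity(x)] for x in pi}
lhs = image(pi_post, parity)

# Right-hand side: phi[pi] ∩ tau, again with n = 1.
pushed = image(pi, parity)
rhs = {y: pushed.get(y, Fraction(0)) * tau[y] for y in tau}

compatible = all(lhs.get(y, Fraction(0)) == rhs[y] for y in tau)
print(compatible)
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than with a tolerance.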



3 Measures and sets 
3.1 Measure-sets 

The definitions and properties above have a direct relation with definitions and 
properties in set theory, and, in some sense, they generalize them. To see this, 
let us start by introducing the notion of measure-set. 

Let $(\Omega, \mathcal{F}, \mu)$ be a $\sigma$-finite measure space. To every set
$A \in \mathcal{F}$ we shall associate a measure, denoted $\mu_A$, and defined via the
condition

$$\mu_A[F] = \frac{1}{n_A}\, \mu[A \cap F] \quad \text{for every } F \in \mathcal{F} \tag{9}$$

where $n_A$ is a suitably chosen constant (that may depend on $A$ but not on $F$).
As suggested above, one may well take

$$n_A = 1 \quad \text{for arbitrary measures} \tag{10}$$

or

$$n_A = \mu[A] \quad \text{for probability measures} \tag{11}$$

because, then, $\mu_A[\Omega] = 1$ (of course, this assumes $\mu[A] \neq 0$). Such a
measure shall be called a measure-set, so we can talk about the measure-set $\mu_A$
associated with a set $A$ by a measure $\mu$. The $\mu$-density associated with a
measure-set $\mu_A$ is clearly¹² $\frac{1}{n_A} \chi_A$, i.e., proportional to the
characteristic function of the set $A$.

11 The constant $n''$ equals one, because for general measures the two constants in
equations (2) and (4) should be taken equal to one, while for probability measures, there
is an automatic renormalization.

Of course, there may be subsets of $\Omega$ that are not in $\mathcal{F}$, but, as far
as one is only interested in the sets in $\mathcal{F}$, one can consider that any measure
absolutely continuous w.r.t. $\mu$ is something like a generalized set: while a
measure-set can be identified to a set (its density taking only two possible values,
$0$ or $1/n_A$), an arbitrary measure (with a density taking any nonnegative value) is a
kind of generalized object, that contains measure-sets and, therefore, sets as special
cases.

The names given to the three notions introduced above — intersection of 
measures, and image and reciprocal image of a measure — are justified because 
(as we are about to see) when applied to measure-sets they do correspond to 
the intersection, the image and the reciprocal image of sets. 



3.2 Measures versus sets: intersection 

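If $A$ and $B$ are two sets in $\mathcal{F}$ and $\mu_A$ and $\mu_B$ are the two
associated measure-sets, one has¹³

$$\mu_A \cap \mu_B = k\, \mu_{A \cap B} \tag{12}$$

with a constant¹⁴ $k$ that depends only on the normalization constants. So, in the
special case where the measures are measure-sets, the definition of intersection of
measures is consistent with the definition of intersection of sets.

3.3 Intersection of measures and conditional probability

Letting $A$ be a fixed set of $\mathcal{F}$, let us now consider the intersection of an
arbitrary measure $\nu$ and the measure-set $\mu_A$, i.e., the measure
$\nu \cap \mu_A$. One has¹⁵

$$(\nu \cap \mu_A)[F] = \frac{1}{n}\, \nu[F \cap A] \quad \text{for every } F \in \mathcal{F} \tag{13}$$

12 As, for every $F \in \mathcal{F}$,
$\int_F \frac{1}{n_A} \chi_A \, d\mu = \frac{1}{n_A} \int_{A \cap F} d\mu = \frac{1}{n_A}\, \mu[A \cap F] = \mu_A[F]$.

13 For any $F \in \mathcal{F}$,
$(\mu_A \cap \mu_B)[F] \propto \int_F \chi_A \chi_B \, d\mu = \int_{F \cap A \cap B} d\mu = \int_F \chi_{A \cap B}\, d\mu \propto \mu_{A \cap B}[F]$.

14 For general measures, $k = 1$, while for probability measures, $k$ is fixed by the
normalization constants $n_A$, $n_B$ and $n_{A \cap B}$.

15 For $(\nu \cap \mu_A)[F] \propto \int_F \chi_A \, d\nu = \int_{F \cap A} d\nu = \nu[F \cap A]$.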
where $n$ is a constant. This is particularly interesting when dealing with
probability measures, because, then, $n = \nu[A]$, and one has

$$(\nu \cap \mu_A)[F] = \frac{\nu[F \cap A]}{\nu[A]} \quad \text{for every } F \in \mathcal{F} \tag{14}$$

One immediately recognizes there the expression of Kolmogorov's conditional
probability, usually denoted $\nu[F \cap A] / \nu[A] = \nu[F \,|\, A]$. Using this notation,

$$(\nu \cap \mu_A)[F] = \nu[F \,|\, A] \quad \text{for every } F \in \mathcal{F} \tag{15}$$

So we have the following

Property: For every given probability measure $\nu$, Kolmogorov's conditional
probability, given any set $A$, is identical to the intersection of $\nu$ by the
measure-set $\mu_A$.
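
The property can be illustrated on a fair die. This is a minimal sketch assuming a finite sample space with the counting measure as base measure; the helper names are hypothetical.

```python
from fractions import Fraction

def measure(masses, F):
    """Measure of the set F for a measure given by its point masses."""
    return sum(m for x, m in masses.items() if x in F)

def intersect_with_measure_set(nu, A):
    """Point masses of nu ∩ mu_A, normalized with n = nu[A] as in equation (14)."""
    n = measure(nu, A)
    if n == 0:
        raise ValueError("nu[A] must be nonzero")
    return {x: (m / n if x in A else Fraction(0)) for x, m in nu.items()}

nu = {face: Fraction(1, 6) for face in range(1, 7)}   # fair die
A = {2, 4, 6}                                         # "the outcome is even"
F = {4, 5, 6}                                         # "the outcome is at least 4"

lhs = measure(intersect_with_measure_set(nu, A), F)   # (nu ∩ mu_A)[F]
rhs = measure(nu, F & A) / measure(nu, A)             # nu[F | A]
print(lhs, rhs)
```

Both sides evaluate the same ratio, which is the content of equation (15).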

So we see that the notion of conditional probability is a special case of
the notion of intersection of measures: when evaluating the intersection of an
arbitrary measure by a measure-set, we have the conditional probability, but
we can evaluate the intersection of two general measures. I claim that there
are problems that are naturally formulated in terms of the intersection of two
measures (see sections 4.1 and 4.5 for examples). As this notion has not been
available so far, some hand-waving has been necessary to make this kind of
problem fit into the available mathematical structure. This, plus the fact that
general mappings between arbitrary sets (as opposed to linear mappings between
linear spaces) can be taken as root elements, is what has motivated the building
of the present theory.



3.4 Measures versus sets: reciprocal images 

When considering a mapping $\varphi$ from a set $X$ into a set $Y$, to every set
$B \subseteq Y$ there is associated a subset of $X$, denoted $\varphi^{-1}[B]$ and named
the reciprocal image¹⁶ of the set $B$. But the reciprocal image of a set can also be
defined in terms of the characteristic functions of the sets: letting $\xi_A$ be the
characteristic function of a set $A \subseteq X$ and $\chi_B$ that of a set
$B \subseteq Y$, one clearly has

$$\xi_{\varphi^{-1}[B]} = \chi_B \circ \varphi \tag{16}$$

a relation that is identical (with $n = 1$) to the relation (4) expressing the
reciprocal image of a measure in terms of Radon-Nikodym densities. So, as it
already happened with the intersection of measures, the notion of reciprocal
image of a measure is consistent with the definition of reciprocal image of a set:
the set associated to the reciprocal image of the measure-set that is associated
to a set $B$ is the reciprocal image of the set $B$:

$$\varphi^{-1}[\nu_B] = \mu_{\varphi^{-1}[B]} \tag{17}$$

In this sense, again, the notion of reciprocal image of a set is "contained" in the
notion of reciprocal image of a measure.

16 The set $\varphi^{-1}[B]$ is made of all the elements of $X$ whose image is in $B$.
3.5 Measures versus sets: images

The relation between the notion of image of a set (in set theory) and the notion
of image of a measure is subtle. In this short note, let us just mention that the
support¹⁷ of the image of a measure, $d(\varphi[\pi])/d\nu$, is the image (in the sense
of set theory) of the support of the original measure, $d\pi/d\mu$.



3.6 Measures versus sets: compatibility property 

In set theory, for arbitrary sets $A_1$ and $A_2$ and for an arbitrary mapping
$\varphi$, one has

$$\varphi[A_1 \cap A_2] \subseteq \varphi[A_1] \cap \varphi[A_2] \tag{18}$$

a relation that is well-known but not very useful here. A more useful relation
(for making inferences involving sets and mappings) is that, for arbitrary sets
$A$ and $B$, and an arbitrary mapping $\varphi$, one has

$$\varphi[A \cap \varphi^{-1}[B]] = \varphi[A] \cap B \tag{19}$$

For reasons that shall become clear in the applications (see section 4.5), it is
interesting to extend this identity into probability theory (or, more generally,
inside measure theory). But, of course, our compatibility property (equation (7))

$$\varphi[\pi \cap \varphi^{-1}[\tau]] = \varphi[\pi] \cap \tau \tag{20}$$

is identical to relation (19), except that it concerns measures instead of sets.
So, in some sense, we have generalized the set relation. In any case, when
the relation $\varphi[\pi \cap \varphi^{-1}[\tau]] = \varphi[\pi] \cap \tau$ is applied
to measure-sets, it becomes relation (19).
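
The difference between relations (18) and (19) is easy to see on finite sets. The following sketch uses an arbitrary, hypothetical mapping (`phi(x) = x % 4`) chosen only because it is not injective.

```python
def image(phi, A):
    """Image phi[A] of a set A."""
    return {phi(x) for x in A}

def reciprocal_image(phi, X, B):
    """Reciprocal image phi^{-1}[B]: all elements of X whose image is in B."""
    return {x for x in X if phi(x) in B}

def phi(x):
    return x % 4          # a non-injective mapping, chosen for illustration

X = set(range(10))
A = {0, 1, 2, 5}
B = {1, 2, 3}

# Relation (19): an equality, for any A, B and phi.
lhs = image(phi, A & reciprocal_image(phi, X, B))
rhs = image(phi, A) & B

# Relation (18): only an inclusion, which can be strict.
A1, A2 = {0}, {4}
inner = image(phi, A1 & A2)
outer = image(phi, A1) & image(phi, A2)
print(lhs, rhs, inner, outer)
```

Here the two disjoint sets `A1` and `A2` share an image point, so the inclusion (18) is strict, while (19) still holds with equality.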



4 Applications 

4.1 Intersection of probability measures 

Let $\Omega$ represent the surface of the sphere of unit radius, and $\mathcal{B}$ the
usual Borel collection of subsets¹⁸. Consider, on the measurable space
$(\Omega, \mathcal{B})$, the ordinary surface measure $S$: for any set
$B \in \mathcal{B}$ of points on the sphere, $S[B]$ is the surface of $B$. Two
probability measures $P_1$ and $P_2$ are then considered, and two points
$\mathcal{P}_1$ and $\mathcal{P}_2$ are randomly created¹⁹ on the surface of the sphere,
that are random point samples of the respective probability measures $P_1$ and $P_2$. If
$\mathcal{P}_1 \neq \mathcal{P}_2$ the two points are discarded, and two new points are
generated. And so on until the two points happen to be identical,
$\mathcal{P}_1 = \mathcal{P}_2 = \mathcal{P}$.

Question: of which probability measure is $\mathcal{P}$ a random point sample?

17 The support of a function is the set of points where the function is not zero.
18 The Borel sigma-field is defined as the smallest sigma-field containing all the open subsets.
19 The notion of random point is not introduced here; it is assumed that the reader knows
the basic notion of sampling from a probability measure.
Answer: point $\mathcal{P}$ is a random point sample of the probability measure

$$P = P_1 \cap P_2 \tag{21}$$

Proof: The probability that the two points $\mathcal{P}_1$ and $\mathcal{P}_2$ happen to
be identical is zero, so the question makes no immediate sense, and needs to be slightly
reformulated. If the sphere is assumed to be tiled with a finite collection of
(spherical) tiles of identical surface $\Delta S$, then it can happen that the two points
$\mathcal{P}_1$ and $\mathcal{P}_2$ are in the same tile. The finite probability that
this happens in a given tile can then be evaluated (it is the renormalized product of the
two probabilities assigned to the tile by each of the two probability measures $P_1$
and $P_2$), and it is when taking the limit $\Delta S \to 0$ that one gets the result.

Introducing the three probability densities $f_1$, $f_2$, and $f_1 \cap f_2$ associated
with the three probability measures $P_1$, $P_2$, and $P_1 \cap P_2$ via

$$P_1[B] = \int_B f_1 \, dS \; ; \quad P_2[B] = \int_B f_2 \, dS \; ; \quad (P_1 \cap P_2)[B] = \int_B (f_1 \cap f_2)\, dS \tag{22}$$

gives here, using equation (3),

$$(f_1 \cap f_2) = \frac{f_1 f_2}{\int_\Omega f_1 f_2 \, dS} \tag{23}$$
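
The tiling argument of the proof can be mimicked by a small simulation. The sketch below replaces the sphere by three tiles with assumed tile probabilities, draws pairs of tiles independently, keeps a pair only when both points fall in the same tile, and compares the observed frequencies with the renormalized product; all numerical values and the function name are illustrative assumptions.

```python
import random

def coincidence_frequencies(p1, p2, draws, seed=0):
    """Frequencies of the common tile among pairs that land in the same tile."""
    rng = random.Random(seed)
    tiles = range(len(p1))
    first = rng.choices(tiles, weights=p1, k=draws)    # samples of P1
    second = rng.choices(tiles, weights=p2, k=draws)   # samples of P2
    counts = [0] * len(p1)
    for a, b in zip(first, second):
        if a == b:            # keep the pair only when both points share a tile
            counts[a] += 1
    kept = sum(counts)
    return [c / kept for c in counts]

p1 = [0.5, 0.3, 0.2]          # probabilities assigned to the tiles by P1
p2 = [0.2, 0.3, 0.5]          # probabilities assigned to the tiles by P2

norm = sum(a * b for a, b in zip(p1, p2))
expected = [a * b / norm for a, b in zip(p1, p2)]      # renormalized product
observed = coincidence_frequencies(p1, p2, draws=200_000)
print(expected, observed)
```

The simulation stays at a fixed, finite tiling; the limit $\Delta S \to 0$ of the proof is not attempted here.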

To pass from this purely mathematical exercise to a problem involving real-life
measurements, assume that two totally "disentangled"²⁰ measurements of
the position of a floating object on the ocean provide the information described
(following ISO's recommendations [ISO, 1993]) by two probability densities $f_1$
and $f_2$. How should they be "combined" to represent the total available
information? The detailed justification of this is outside the scope of this short
note, but I suggest here that experimental uncertainties are defined in such a way
(ISO's way) that the answer to the question is precisely that in equation (23).



4.2 Mapping between discrete sets 

Let $X$ and $Y$ be discrete spaces, $\mathcal{E}$ and $\mathcal{F}$ the respective
collections of their subsets, and $\varphi$ a mapping from $X$ into $Y$. Consider that a
probability measure $\pi$ on $(X, \mathcal{E})$ is sampled, this providing elements
$x_1, x_2, \dots$ of $X$, and therefore, via the mapping $\varphi$, the image elements
$y_1 = \varphi(x_1)$, $y_2 = \varphi(x_2)$, $\dots$ of $Y$. Of which probability measure
$\tau$ on $(Y, \mathcal{F})$ are the elements $y_1, y_2, \dots$ sample points?

The answer is $\tau = \varphi[\pi]$, as this clearly corresponds to the very definition
of image of a measure (equation (5)):

$$\tau[F] = \pi[\varphi^{-1}[F]] \quad \text{for every } F \in \mathcal{F} \tag{24}$$

20 I am trying here to avoid the use of the term independent, which has a related (but
different) connotation in probability theory.
To transform this result into an explicit expression, we can introduce two
base measures $\mu$ and $\nu$ on $(X, \mathcal{E})$ and $(Y, \mathcal{F})$ respectively,
for instance, the respective counting measures²¹. The density $f$ associated with the
measure $\pi$ consists then in the (discrete) collection of numbers $f_i = f(x_i)$ such
that, for every set $E \subseteq X$,
$\pi[E] = \int_E f \, d\mu = \sum_{x_i \in E} f(x_i)$, while the density $g$ (that
we may denote $g = \varphi[f]$) associated to the measure $\tau = \varphi[\pi]$ consists
in the (discrete) collection of numbers $g_\alpha = g(y_\alpha)$ such that for every set
$F \subseteq Y$, $\tau[F] = \int_F g \, d\nu = \sum_{y_\alpha \in F} g(y_\alpha)$. Some
easy computations then provide the solution:

$$g(y_\alpha) = \sum_{x_i \in \varphi^{-1}[\{y_\alpha\}]} f(x_i) \quad \text{for every } y_\alpha \in Y \tag{25}$$
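
Equation (25) amounts to summing the masses of all points that share the same image. A minimal sketch, with a hypothetical mapping (`square`) and an assumed density `f`:

```python
from fractions import Fraction

def image_density(f, phi):
    """g(y) = sum of f(x) over all x in phi^{-1}[{y}], as in equation (25)."""
    g = {}
    for x, fx in f.items():
        y = phi(x)
        g[y] = g.get(y, Fraction(0)) + fx
    return g

def square(x):
    return x * x

f = {-1: Fraction(1, 4), 0: Fraction(1, 2), 1: Fraction(1, 4)}
g = image_density(f, square)
print(g)
```

Points of $Y$ that are not reached by the mapping simply do not appear in the result, which corresponds to a zero density there.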



4.3 Propagation of uncertainties in physical measurements 

Physical quantities are often defined in terms of other physical quantities. For
instance, the electric resistance $R$ of a wire is defined as the ratio of the voltage
$V$ applied to the wire and the current intensity $I$ flowing in the wire. Then,
a typical measurement of $R$ involves, in fact, the measurement of the two quantities
$V$ and $I$ and the computation of the ratio $R = V/I$.

So, more generally, when one wants to perform a physical measurement of
the value of some physical quantity, say $y$, most of the time one resorts to
measuring in fact some other quantities, say $\{x^1, x^2, \dots, x^p\}$, and then one
computes the value of $y$ via its definition

$$y = \varphi(x^1, x^2, \dots, x^p) \tag{26}$$

One very basic problem in metrology is that of "propagating" the uncertainties
appearing in the measurements of the quantities $x^i$ into the uncertainty on the
quantity $y$. Good metrology practice corresponds (ISO, 1993) to representing
the uncertainties on a measurement by a probability density (as opposed to
a simple "uncertainty bar"). Therefore, one faces the following problem:

Question: One has some probability measure $\pi$ defined on the quantities
$\{x^1, x^2, \dots, x^p\}$, and one defines the quantity $y$ via the mapping in
equation (26). What probability measure $\tau$ does this imply on the quantity $y$?

Short answer: The probability measure $\tau$ is the image of the probability
measure $\pi$, i.e., according to our general definition of image of a measure:
$\tau = \varphi[\pi]$.

But let us state the problem using a more general terminology.

Preliminaries: Some of the quantities $x = \{x^1, x^2, \dots, x^p\}$ may be discrete,
while others may be real quantities, each taking values inside some interval (open
or closed). Let $X$ be the set (part discrete, part continuous) whose elements
correspond to all the possible values of the quantities $x$. Introducing an
appropriate $\sigma$-field of subsets of $X$ is, generally, quite easy²², so one
immediately faces a measurable space $(X, \mathcal{E})$. We can consider, for more
generality, that the quantity $y$ also is "multidimensional":
$y = \{y^1, y^2, \dots, y^q\}$. A measurable space $(Y, \mathcal{F})$ is introduced as
above. Unless the mapping $x \mapsto y = \varphi(x)$ is pathological²³, it will be
measurable (with respect to the two $\sigma$-fields $\mathcal{E}$ and $\mathcal{F}$).
The (uncertain) result of the measurement of the quantities $x$ is represented
by a $\sigma$-finite²⁴ measure $\pi$ on $(X, \mathcal{E})$.

21 The counting measure of a set is the number of elements in the set.

Question: How do the uncertainties encapsulated by the measure $\pi$ "propagate"
into uncertainties on the space $(Y, \mathcal{F})$, i.e., which is the measure, say
$\tau$, implied on $(Y, \mathcal{F})$ by the measure $\pi$ and the mapping $\varphi$?

Answer: The notion of "propagation of uncertainties" can be made precise
by imposing that the probability $\tau[B]$ of any subset $B \in \mathcal{F}$ must equal
the probability of the pre-image (or reciprocal image) of the subset:

$$\tau[B] = \pi[\varphi^{-1}[B]] \quad \text{for any } B \in \mathcal{F} \tag{27}$$

But this is exactly our definition of image of a measure, so the answer is

$$\tau = \varphi[\pi] \tag{28}$$



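Example: The measurement of an electric resistance $R$ involves the measurement
of the two quantities $V$ and $I$ and the use of the definition $R = V/I$.
If the result of the measurement of $V$ and $I$ (and the associated uncertainties)
is that represented by the (lognormal) probability density

$$f(V, I) = \frac{1}{\sqrt{2\pi}\, \sigma_V V} \exp\left( -\frac{1}{2} \frac{\log^2(V/V_0)}{\sigma_V^2} \right) \frac{1}{\sqrt{2\pi}\, \sigma_I I} \exp\left( -\frac{1}{2} \frac{\log^2(I/I_0)}{\sigma_I^2} \right) \tag{29}$$

then, the notion of image of a measure produces, for the electric resistance $R$,
the (lognormal) probability density

$$g(R) = \frac{1}{\sqrt{2\pi}\, \sigma_R R} \exp\left( -\frac{1}{2} \frac{\log^2(R/R_0)}{\sigma_R^2} \right) \tag{30}$$

where

$$R_0 = V_0 / I_0 \quad \text{and} \quad \sigma_R = \sqrt{\sigma_V^2 + \sigma_I^2} \tag{31}$$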



22 A Cartesian product of some Borel fields (for the real variables) times the
collections of all the possible subsets (for the discrete variables).

23 Physicists need to try hard before being able to introduce mappings that are not
measurable with respect to the obvious topologies.

24 Physicists will typically represent their measurement uncertainties by introducing
probability densities and discrete probabilities, to be interpreted as the Radon-Nikodym
derivatives of the measure $\pi$, so $\pi$ will be $\sigma$-finite by construction.

Let us see some details of that. The space $X$ is
$(0, \infty) \times (0, \infty) \subset \mathbb{R}^2$, that we endow with two coordinates
$\{V, I\}$ (having the physical interpretation of an electric voltage and an electric
intensity). The space $Y$ is $(0, \infty) \subset \mathbb{R}$, that we endow with a
coordinate $R$ (having the physical interpretation of an electric resistance). The
mapping $\varphi$ is (definition of electric resistance) $R = V/I$. The usual Borel
$\sigma$-fields of $X$ and of $Y$ (say $\mathcal{B}_2$ and $\mathcal{B}_1$) are
introduced, and the usual Lebesgue measures are considered as base measures. To arrive
at the density $g(R)$ one can here introduce the "slack" variable $P = V I$, this
allowing one to consider the "change of variables" $\{V, I\} \mapsto \{R, P\}$. One then
easily evaluates the density $g_0(R, P)$ (using the Jacobian of the transformation), and,
from it, $g(R) = \int_0^\infty g_0(R, P)\, dP$. It can be shown that the final result for
$g(R)$ is independent of the particular choice of slack variable.
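
The result $R_0 = V_0/I_0$, $\sigma_R = \sqrt{\sigma_V^2 + \sigma_I^2}$ can be checked by sampling instead of by the change of variables. The sketch below assumes that $V$ and $I$ are independent lognormal variables, and the numerical values of $V_0$, $I_0$, $\sigma_V$, $\sigma_I$ are arbitrary choices for illustration; it pushes samples through $R = V/I$ and compares the mean and standard deviation of $\log R$ with the predicted values.

```python
import math
import random

def simulate_log_resistance(v0, i0, sigma_v, sigma_i, draws, seed=0):
    """Sample mean and standard deviation of log(V / I) for lognormal V and I."""
    rng = random.Random(seed)
    logs = []
    for _ in range(draws):
        v = rng.lognormvariate(math.log(v0), sigma_v)   # log V ~ N(log V0, sigma_V)
        i = rng.lognormvariate(math.log(i0), sigma_i)   # log I ~ N(log I0, sigma_I)
        logs.append(math.log(v / i))                    # image under R = V / I
    mean = sum(logs) / draws
    std = math.sqrt(sum((x - mean) ** 2 for x in logs) / draws)
    return mean, std

v0, i0, sigma_v, sigma_i = 10.0, 2.0, 0.1, 0.2          # illustrative values
mean_log_r, std_log_r = simulate_log_resistance(v0, i0, sigma_v, sigma_i, draws=100_000)

expected_mean = math.log(v0 / i0)                       # log R0, with R0 = V0 / I0
expected_std = math.hypot(sigma_v, sigma_i)             # sqrt(sigma_V^2 + sigma_I^2)
print(mean_log_r, expected_mean, std_log_r, expected_std)
```

This is the sampling interpretation of the image of a measure: the samples of $R$ are obtained by applying the mapping to samples of $(V, I)$, with no density computation needed.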



4.4 Interpretation of observations (1: using sets) 

In the physical sciences, some problems of interpretation of observations can be
idealized as follows. There are two sets $X$ and $Y$, a mapping $\varphi$ from $X$ into
$Y$, and

(i) we are interested in identifying a particular element $x \in X$, and we have
the "a priori information" that it belongs to a subset $X_{\rm prior} \subseteq X$:

$$x \in X_{\rm prior} \tag{32}$$

(ii) we have "observed" that some element $y \in Y$ belongs to a subset
$Y_{\rm obs} \subseteq Y$:

$$y \in Y_{\rm obs} \tag{33}$$

and (iii) we know that $y$ is related to $x$ via the mapping $\varphi$:

$$y = \varphi(x) \tag{34}$$

These three pieces of information, when put together (see the left of figure 1),
allow one to infer (using standard set theory reasoning):

(i) that the element $x$ belongs, in fact, to a set $X_{\rm post}$ that is smaller than
or equal to the original set $X_{\rm prior}$,

$$x \in X_{\rm post} = X_{\rm prior} \cap \varphi^{-1}[Y_{\rm obs}] \subseteq X_{\rm prior} \tag{35}$$

(ii) while the element $y$ belongs, in fact, to a set $Y_{\rm post}$ that is smaller than
or equal to the original set $Y_{\rm obs}$,

$$y \in Y_{\rm post} = \varphi[X_{\rm prior}] \cap Y_{\rm obs} \subseteq Y_{\rm obs} \tag{36}$$

These two results are obvious. Perhaps less obvious is the relation

$$Y_{\rm post} = \varphi[X_{\rm post}] \tag{37}$$

that follows directly from the universal set property
$\varphi[A \cap \varphi^{-1}[B]] = \varphi[A] \cap B$ (equation (19)).
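
The inference steps (35) to (37) can be followed on a toy problem. The sets and the mapping below ($\varphi(x) = x^2$ on a few integers) are assumptions chosen only to make the example concrete.

```python
def image(phi, A):
    """Image phi[A] of a set A."""
    return {phi(x) for x in A}

def reciprocal_image(phi, X, B):
    """Reciprocal image phi^{-1}[B] within X."""
    return {x for x in X if phi(x) in B}

def phi(x):
    return x * x                               # hypothetical "physical theory"

X = set(range(-5, 6))
X_prior = {x for x in X if x >= -2}            # a priori information on x
Y_obs = {4, 9, 16, 36}                         # observation on y

X_post = X_prior & reciprocal_image(phi, X, Y_obs)     # equation (35)
Y_post = image(phi, X_prior) & Y_obs                   # equation (36)
print(sorted(X_post), sorted(Y_post))
print(Y_post == image(phi, X_post))                    # equation (37)
```

The observed value 36 is discarded because no element of $X$ can produce it, which shows how the observation itself gets refined by the prior information.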
Note that we are inside the paradigm typical of a "problem of assimilation of
observations" (sometimes called an "inverse modeling problem"): the
mapping $x \mapsto y = \varphi(x)$ can be seen as the typical mapping between the "model
parameter space" and the "observable parameter space". Concerning the
element $x \in X$, we pass from the "a priori information" $x \in X_{\rm prior}$ to the
"a posteriori information" $x \in X_{\rm post} \subseteq X_{\rm prior}$. Similarly,
concerning the element $y \in Y$, we pass from the "initial observation"
$y \in Y_{\rm obs}$ to the "refined observation"
$y \in Y_{\rm post} \subseteq Y_{\rm obs}$.

In the next section, the same problem is reformulated, but using probability
measures instead of sets.



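[Figure 1 panels. Left column (sets), top to bottom: $X_{\rm prior}$ and $Y_{\rm obs}$
($x \in X_{\rm prior}$, $y \in Y_{\rm obs}$); $\varphi^{-1}[Y_{\rm obs}]$ and
$\varphi[X_{\rm prior}]$; $X_{\rm post} = X_{\rm prior} \cap \varphi^{-1}[Y_{\rm obs}]$
and $Y_{\rm post} = \varphi[X_{\rm prior}] \cap Y_{\rm obs}$; finally
$Y_{\rm post} = \varphi[X_{\rm post}]$. Right column (measures), top to bottom:
$\pi_{\rm prior}$ and $\tau_{\rm obs}$ ($x \sim \pi_{\rm prior}$,
$y \sim \tau_{\rm obs}$); $\varphi^{-1}[\tau_{\rm obs}]$ and $\varphi[\pi_{\rm prior}]$;
$\pi_{\rm post} = \pi_{\rm prior} \cap \varphi^{-1}[\tau_{\rm obs}]$ and
$\tau_{\rm post} = \varphi[\pi_{\rm prior}] \cap \tau_{\rm obs}$; finally
$\tau_{\rm post} = \varphi[\pi_{\rm post}]$.]
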

Figure 1: At the left, an inference problem that can be solved using only set 
theory (see text). At the right, a similar problem, but this one concerning 
probability measures (see text). This example at the right corresponds to many 
of the so-called inverse problems in the experimental sciences. 
4.5 Interpretation of observations (2: using measures)



The problem of interpretation of observations, sometimes called the "inverse
problem", appears as follows. A physical system (e.g., a molecule, an ocean,
a planet's atmosphere, a galaxy) is under investigation. For the purposes of the
investigation, the system is described using a collection of $p$ physical quantities
$x = \{x^1, x^2, \dots, x^p\}$, some of them taking only discrete values (for instance,
black or white) and some others taking continuous values (for instance, a
temperature can take any positive real value). A set $X$ is introduced, the elements
(or "points") of which correspond to the quantities $x$ taking all their possible
values. In the jargon of inverse problem theory, the set $X$ is called the model
space²⁵. In order to gain information on the $p$ physical quantities $x$, a set of $q$
physical quantities $y = \{y^1, y^2, \dots, y^q\}$ (perhaps only quite indirectly related
to the quantities $x$) is measured. As above, when considering all possible
values for the $q$ quantities $y$ one is faced with a set $Y$, the "observable
parameter space". One then (usually implicitly) considers two collections
$\mathcal{E}$ and $\mathcal{F}$ of subsets of $X$ and of $Y$ respectively, that are
$\sigma$-algebras, so one has two measurable spaces $(X, \mathcal{E})$ and
$(Y, \mathcal{F})$. On each of these spaces, a base ($\sigma$-finite) measure has to be
considered in order to have two $\sigma$-finite measure spaces,
$(X, \mathcal{E}, \mu)$ and $(Y, \mathcal{F}, \nu)$. To a mathematician, the existence of
the two base measures $\mu$ and $\nu$ may seem a minor hypothesis. A physicist may have
to work hard to find them, as they must represent the volume measure of each
space, and, as such, they must have the necessary invariances²⁶. For this reason,
let us here call the two base measures $\mu$ and $\nu$ the respective volume measures.
They matter, because the reciprocal image of a probability measure on
$(Y, \mathcal{F})$ and the intersection of measures on $(X, \mathcal{E})$ and on
$(Y, \mathcal{F})$ depend on them.

The final structure element is that a physical theory is assumed to exist, that
is able (given any possible value of the model parameters $x$) to predict (in
a Popperian sense) the observations $y$. This prediction consists in a mapping
$\varphi : X \mapsto Y$, that must be assumed to be measurable (which, for a physicist,
just means that $\varphi$ is assumed to be not "pathological"). Of course, the mapping
$\varphi$ is not assumed to be invertible (and it may be "nonlinear").

25 In fact, the set $X$ is something more abstract: any (invertible) change of variables $x \mapsto x'$
is to be seen as a "change of coordinates" inside $X$, not as the definition of a new set $X'$.
For a discussion of this kind of intrinsic view on physical quantities, see Tarantola (2006).

26 For instance, one of the quantities may be the period $T$ (of, say, a star). How to
measure the volume (in fact, the length) of an interval $(T_1, T_2)$, say $\mu[(T_1, T_2)]$? Taking
$\mu[(T_1, T_2)] = |T_2 - T_1|$ would not be consistent with measuring the volume as
$|\omega_2 - \omega_1|$ when working with the frequency $\omega = 2\pi/T$, because
$|T_2 - T_1| \neq |\omega_2 - \omega_1|$. In fact, the
right volume measure is $\mu[(T_1, T_2)] = |\log(T_2/T_1)|$, because $|\log(T_2/T_1)| = |\log(\omega_2/\omega_1)|$.
See Tarantola (2005) for an elementary discussion of this problem, or Tarantola (2006) for a
more advanced discussion.

In a typical inverse problem one takes care to introduce any available a priori
information on $x$ (that means information available before the measurements
on $y$ are carried out) as a probability measure, say $\pi_{\text{prior}}$, on $(X, \mathcal{E})$ (it must
be "ordinary", i.e., $\sigma$-finite, but it does not need to be absolutely continuous
w.r.t. the volume measure $\mu$). When the measurements of the quantities $y$ are
carried out, the result is represented^27 as a probability measure, say $\tau_{\text{obs}}$, on
$(Y, \mathcal{F})$, that must be absolutely continuous w.r.t. the volume measure $\nu$ (i.e., it
has to be representable by a density). So, one has the following three elements
(see the right of figure 1):

(i) a priori information on the model parameters, i.e., a probability measure on
$(X, \mathcal{E})$,

$$\pi_{\text{prior}} ; \qquad (38)$$

(ii) results of the measurements, i.e., a probability measure on $(Y, \mathcal{F})$,

$$\tau_{\text{obs}} ; \qquad (39)$$

and (iii) the modeling mapping

$$\varphi : X \to Y . \qquad (40)$$

It is clear that the existence of a mapping $\varphi$ is going to transform the
measure $\pi_{\text{prior}}$ into some other measure, say $\pi_{\text{post}}$, and the measure $\tau_{\text{obs}}$ into some
measure, say $\tau_{\text{post}}$, much as it happened in section 4.4, where the prior sets were
transformed into posterior sets. The problem is that here we are facing natural
objects (measurements and physical laws) that are not easily amenable to
axiomatization. Usual presentations of the inverse problem unconvincingly use
intuitive interpretations of the notion of conditional probability and of, perhaps,
Bayes' theorem. I prefer here to frankly state that I formulate the problem using
only the analogy between the present problem and the set-theoretical problem
in section 4.4 (although closer analogies can be elaborated^28). The results there
(that concerned intersection of sets, and images and reciprocal images of sets)
were unquestionable. Insofar as the present theory (defining the intersection of
measures, and images and reciprocal images of measures) is an acceptable
generalization of (a part of) set theory (and the compatibility property suggests
that it is), we can match this problem to the one in section 4.4 (see also the
parallel suggested in figure 1). In any case, the formulas we are going to find
for the inverse problem are basically identical to (although, perhaps, a little more
general than) those proposed in the usual literature^29.
Then (see the illustration at the right of figure 1):

(i) on the model parameter space $(X, \mathcal{E})$, one passes from the prior probability
measure $\pi_{\text{prior}}$ to the posterior probability measure

$$\pi_{\text{post}} = \pi_{\text{prior}} \cap \varphi^{-1}[\tau_{\text{obs}}] ; \qquad (41)$$



27 The representation of the (possibly uncertain) result of an observation as a probability 
measure is in compliance with ISO's (1993) recommendations. 

28 Assume that a random $x$ and a random $y$ are created according respectively to $\pi_{\text{prior}}$
and $\tau_{\text{obs}}$, and that the pair $\{x, y\}$ is accepted only if $y = \varphi(x)$ (much as we did in section 4.1).
It is easy to prove (see the argument in section 4.1) that when a pair $\{x, y\}$ is accepted, $x$ is
a sample point of the measure $\pi_{\text{post}} = \pi_{\text{prior}} \cap \varphi^{-1}[\tau_{\text{obs}}]$ and $y$ is a sample point of the
measure $\tau_{\text{post}} = \varphi[\pi_{\text{prior}}] \cap \tau_{\text{obs}}$. These are exactly expressions (41) and (42).

29 For probabilistic formulations of the inverse problem, see Tarantola and Valette (1982),
Menke (1989), Mosegaard and Tarantola (1995), Aster et al. (2005), or Tarantola (2005). For
an alternative, statistical decision theory, see Evans and Stark (2002).
(ii) on the observable parameter space $(Y, \mathcal{F})$, one passes from the initial
probability measure $\tau_{\text{obs}}$ (representing the result of the measurements) to the
probability measure

$$\tau_{\text{post}} = \varphi[\pi_{\text{prior}}] \cap \tau_{\text{obs}} \qquad (42)$$

representing a refined estimation of the values of the observable parameters $y$.
Finally, (iii) the compatibility property (equation (7)) nicely states that

$$\tau_{\text{post}} = \varphi[\pi_{\text{post}}] . \qquad (43)$$
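
The three relations (41)-(43) can be checked by hand on a toy example. The sketch below rests on assumptions that are not in the text: both spaces are finite, the volume measures are taken to be counting measures (so that densities are simply point masses), and the mapping and the numerical values are hypothetical. Under these assumptions, the posterior on X is proportional to the prior times the observation mass at the predicted value, and the posterior on Y is proportional to the image of the prior times the observation mass.

```python
from collections import defaultdict

# Hypothetical finite model space X = {0, ..., 4} and observable space Y = {'a', 'b', 'c'}.
phi = {0: "a", 1: "a", 2: "b", 3: "b", 4: "c"}          # modeling mapping (not invertible)
pi_prior = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.3, 4: 0.1}     # prior probability on X
tau_obs = {"a": 0.6, "b": 0.3, "c": 0.1}                # observation probability on Y

def normalize(weights):
    """Rescale non-negative weights so that they sum to one."""
    total = sum(weights.values())
    if total == 0.0:
        raise ValueError("normalization constant n is zero")
    return {key: value / total for key, value in weights.items()}

def image(measure, mapping):
    """Image of a measure on X under the mapping: mass of y is the mass of its preimage."""
    out = defaultdict(float)
    for x, mass in measure.items():
        out[mapping[x]] += mass
    return dict(out)

# Discrete analogue of (41): pi_post = pi_prior "intersected with" phi^{-1}[tau_obs].
pi_post = normalize({x: pi_prior[x] * tau_obs[phi[x]] for x in pi_prior})

# Discrete analogue of (42): tau_post = phi[pi_prior] "intersected with" tau_obs.
prior_image = image(pi_prior, phi)
tau_post = normalize({y: prior_image.get(y, 0.0) * tau_obs[y] for y in tau_obs})

# Discrete analogue of (43): the image of pi_post should coincide with tau_post.
pushed_forward = image(pi_post, phi)
for y in tau_obs:
    print(y, round(tau_post[y], 6), round(pushed_forward.get(y, 0.0), 6))
```

In this toy case the two columns printed for each observable value coincide, as the compatibility property requires.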

Let us evaluate the posterior probability $\pi_{\text{post}}[E]$ of some set $E \in \mathcal{E}$. From
expression (41) it follows (using, first, the definition of the intersection of
measures in equation (2), then, the definition of the reciprocal image of a measure) that

$$\pi_{\text{post}}[E] = \frac{1}{n} \int_E \frac{d\tau_{\text{obs}}}{d\nu}(\varphi(x)) \, d\pi_{\text{prior}} , \qquad (44)$$

where the constant $n = \int_X \frac{d\tau_{\text{obs}}}{d\nu}(\varphi(x)) \, d\pi_{\text{prior}}$ must be different from zero (in
order for $\pi_{\text{post}}$ to be a probability measure). Note that the density $d\tau_{\text{obs}}/d\nu$
exists because $\tau_{\text{obs}}$ was assumed to be absolutely continuous w.r.t. the volume
measure $\nu$.

To evaluate the posterior probability $\tau_{\text{post}}[F]$ of a set $F \in \mathcal{F}$, one could
try to start with expression (42), but this possibility is not the most practical.
One can rather use the compatibility relation (43), as then (because of the
relation defining the image of a measure),

$$\tau_{\text{post}}[F] = \pi_{\text{post}}[\varphi^{-1}[F]] . \qquad (45)$$

In real-life problems, the finite probabilities $\pi_{\text{post}}[E]$ and $\tau_{\text{post}}[F]$ (i.e., the
sums in equations (44)-(45)) can (approximately) be evaluated using Monte
Carlo methods^30.
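
A minimal sketch of such a Monte Carlo evaluation follows, using the accept/reject scheme of Mosegaard and Tarantola (1995) described in the footnote: prior samples are kept with a probability proportional to the observation density evaluated at the predicted value. Everything specific here is a hypothetical assumption made for illustration: the prior is uniform on [0, 2], the mapping is x -> x^2, the observation density is Gaussian, and the volume measure on Y is taken to be the Lebesgue measure.

```python
import math
import random

def phi(x):
    """Hypothetical modeling mapping from X = [0, 2] into Y = [0, 4]."""
    return x * x

def obs_density(y, center=1.0, sigma=0.2):
    """Assumed density d(tau_obs)/d(nu): a Gaussian, with nu taken as Lebesgue measure."""
    z = (y - center) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def sample_posterior(num_prior_samples, rng):
    """Keep each prior sample x with probability k * obs_density(phi(x))."""
    k = 1.0 / obs_density(1.0)  # the density peaks at its center, so k * density <= 1
    kept = []
    for _ in range(num_prior_samples):
        x = rng.uniform(0.0, 2.0)  # draw from the (assumed uniform) prior
        if rng.random() < k * obs_density(phi(x)):
            kept.append(x)
    return kept

rng = random.Random(0)                   # fixed seed, for reproducibility only
xs_post = sample_posterior(20000, rng)   # approximate sample of pi_post
ys_post = [phi(x) for x in xs_post]      # approximate sample of tau_post

# Estimate pi_post[E] for E = [0.8, 1.2] and tau_post[F] for F = [0.64, 1.44].
# Here phi^{-1}[F] = E, so the two estimates should agree, as in equation (45).
prob_E = sum(0.8 <= x <= 1.2 for x in xs_post) / len(xs_post)
prob_F = sum(0.64 <= y <= 1.44 for y in ys_post) / len(ys_post)
print(len(xs_post), prob_E, prob_F)
```

The constant k is chosen as the reciprocal of the maximum of the density, which is one admissible choice among many; a smaller k would remain valid but would discard more samples.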

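As a final remark, should the prior probability measure $\pi_{\text{prior}}$ be absolutely
continuous w.r.t. the volume measure $\mu$ (i.e., should the density $d\pi_{\text{prior}}/d\mu$
exist), the posterior probability measure $\pi_{\text{post}}$ would also have a density, whose
explicit expression would follow immediately from equation (44):

$$\frac{d\pi_{\text{post}}}{d\mu} = \frac{1}{n} \, \frac{d\pi_{\text{prior}}}{d\mu} \, \left( \frac{d\tau_{\text{obs}}}{d\nu} \circ \varphi \right) ,$$

i.e., explicitly, $\frac{d\pi_{\text{post}}}{d\mu}(x) = \frac{1}{n} \, \frac{d\pi_{\text{prior}}}{d\mu}(x) \, \frac{d\tau_{\text{obs}}}{d\nu}(\varphi(x))$. In the jargon of inverse
theory, $x \mapsto \frac{d\tau_{\text{obs}}}{d\nu}(\varphi(x))$ is called the "likelihood function".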



30 From expression (44) it follows that if $x_1, x_2, \dots$ is a collection of (independent) random
sample elements of the prior probability measure $\pi_{\text{prior}}$, and if, for every element, a random
decision is taken to conserve or discard it with the probability of being conserved equal to
$k \, \frac{d\tau_{\text{obs}}}{d\nu}(\varphi(x_i))$ (where the positive constant $k$ is arbitrary, except that it must ensure that
the maximum attained value is $\le 1$), then the collection $x'_1, x'_2, \dots$ of conserved elements
is a sample of $\pi_{\text{post}}$ (Mosegaard and Tarantola [1995]). And (as it follows from the definition
of image of a measure) the collection $\varphi(x'_1), \varphi(x'_2), \dots$ is a sample of $\tau_{\text{post}}$.